DBTMPE: Deep Bidirectional Transformers-Based Masked Predictive Encoder Approach for Music Genre Classification

نویسندگان

چکیده

Music is a type of time-series data. As the size data increases, it challenge to build robust music genre classification systems from massive amounts Robust require large labeled data, which necessitates time- and labor-intensive data-labeling efforts expert knowledge. This paper proposes musical instrument digital interface (MIDI) preprocessing method, Pitch Vector (Pitch2vec), deep bidirectional transformers-based masked predictive encoder (MPE) method for classification. The MIDI files are considered as input. converted vector sequence by Pitch2vec before being input into MPE. By unsupervised learning, MPE based on transformers designed extract representations automatically, musicological insight. In contrast other deep-learning models, such recurrent neural network (RNN)-based enables parallelization over time-steps, leading faster training. To evaluate performance proposed experiments were conducted Lakh dataset. During training, approximately 400,000 segments utilized MPE, recovery accuracy rate reached 97%. task, indicators more than 94%. experimental results indicate that improves compared with state-of-the-art models.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Music Genre Classification using Masked Conditional Neural Networks

The ConditionaL Neural Networks (CLNN) and the Masked ConditionaL Neural Networks (MCLNN) exploit the nature of multi-dimensional temporal signals. The CLNN captures the conditional temporal influence between the frames in a window and the mask in the MCLNN enforces a systematic sparseness that follows a filterbank-like pattern over the network links. The mask induces the network to learn about...

متن کامل

Deep Belief Networks for Automatic Music Genre Classification

This paper proposes an approach to automatic music genre classification using deep belief networks. Based on the restricted Boltzmann machines, the deep belief networks is constructed and takes the acoustic features extracted through content-based analysis of music signals as input. The model parameters are initially determined after the deep belief network is trained by greedy layer-wise learn...

متن کامل

"multilingual" Deep Neural Network for Music Genre Classification

Multilingual deep neural network (DNN) has been widely used in low-resource automatic speech recognition (ASR) in order to balance the rich-resource and low-resource speech recognition or to build the low-resource ASR system quickly. Inspired by the idea of using multilingual DNN for ASR, we use a “multilingual” DNN (Multi-DNN) for music genre classification. However, we do not have “multilingu...

متن کامل

Music Genre Classification: A Multilinear Approach

In this paper, music genre classification is addressed in a multilinear perspective. Inspired by a model of auditory cortical processing, multiscale spectro-temporal modulation features are extracted. Such spectro-temporal modulation features have been successfully used in various content-based audio classification tasks recently, but not yet in music genre classification. Each recording is rep...

متن کامل

A Novel Approach to Music Genre Classification

Recently, with the construction of digital music libraries, it is important to efficiently manage a large music database. It will be helpful to provide a content-based music genre classification system for managing a large database. Therefore, in this paper we will propose two novel music features, low-frequency energy ratio (LFER) and energy domain signal coding (EDSC), for music genre classif...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Mathematics

سال: 2021

ISSN: ['2227-7390']

DOI: https://doi.org/10.3390/math9050530